ABSTRACT

Among the most critical professional characteristics of teacher educators is that of reflectivity. The ability to self-judge
our own practice context, capability, and performance against the broader professional contexts of practice by
teacher educators has been noted by the National Council for Accreditation of Teacher Education (NCATE). The
capacity for teacher educators to demonstrate professional reflection and to inculcate this capacity in pre-licensure
candidates in colleges of education is among the standards for accreditation in the NCATE criteria (NCATE, Standard 2).
As a consequence, research designed to uncover this reflective capacity, to scale it for comparative study, and to relate
it to standard measures of program quality is viewed as critical to a more realistic understanding of the capability of
faculty in higher education (teacher educators) to meet the reform goals for K-12 education broadly.

The purpose of this study was to determine whether it was possible to distinguish divergent types or levels of reflective
practice among the reflective strategies of teacher educators. The findings indicated that the Reflective Judgment Model
(King and Kitchener, 1994) is a reliable and valid conceptual model; therefore it would be appropriate to directly
compare reflective scores for teacher educators to those of other professions which have been studied with this same RJM. It was
determined that teacher educators were more typically at the center of the epistemic scale. Given this finding, there is 
room for professional development work to enhance the evolution of teacher educators with respect to reflective 
capacity. 
INTRODUCTION

The centrality of reflection remains a goal of education, 
especially higher education; this is evident in several recent 
national reports on undergraduate education, each of 
which reiterated the need for college graduates to think 
reflectively (American Association of Colleges and 
Universities, AAC & U, 2002; American Association of Higher 
Education, American College Personnel Association ACPA, 
and National Association of Student Personnel 
Administrators, 1998; ACPA, 1994, as cited in King and 
Kitchener, 2004, p.6). 

Among the most critical professional characteristics of 
teacher educators is that of reflectivity. The ability to self-judge
our own practice context, capability, and
performance against the broader professional contexts of
practice by teacher educators has been noted by the
National Council for Accreditation of Teacher Education 
(NCATE). The capacity for teacher educators to 
demonstrate professional reflection and to inculcate this 
capacity in pre-licensure candidates in colleges of 
education is among the standards for accreditation in the 
NCATE criteria (NCATE, Standard 2). As a consequence, 
research designed to uncover this reflective capacity, to 
scale it for comparative study, and to relate it to standard 
measures of program quality is viewed as critical to a
more realistic understanding of the capability of faculty in 
higher education (teacher educators) to meet the reform 
goals for K-12 education broadly. 

Unfortunately, traditional models of professional
development for educators have been built from a
cognition model in isolation from the increasingly complex
practice environment where decision-making is clouded
by conflicting policy and socio-cultural constraints,
although calls to reform have been issued
repeatedly. Butler (2004) notes a further deficiency with
respect to professional development: "a related criticism of
the traditional model is that it is based on questionable 
assumptions about the nature and origins of professional 
knowledge, and about how to forge connections between 
research and practice" (p.437). In this gap, as has been 
noted frequently, what passes for educational 
development is typically disjointed, incoherent, and 
unconnected from authentic professional decision-making
responsibilities for educators at all levels (Corcoran,
1995; Day, 1993; Livneh, 1999). 

Studies Using King and Kitchener's Model 

Research has consistently demonstrated a significant 
relationship between educational level and a person's 
ability to make reflective judgments. According to 
Friedman and others, those with more formal education 
are more likely than those with less education to exhibit the 
most complex types of thinking described in King and 
Kitchener's reflective judgment model (RJM) (Friedman, 
2004, p. 297). 

Although often compared with critical thinking, the RJM is 
distinct in its emphasis on the intellectual tasks involved in 
open-ended problem solving rather than closed-ended, 
the attention to epistemic assumptions, and the 
articulation of stages of development (Hofer, 2001). Ill- 
defined problems, according to King and Kitchener (2004, 
p. 5), are characterized by two features: they cannot be
defined with a high degree of completeness and they 
cannot be solved with a high degree of certainty. 

After twenty-five years of investigating how late adolescents 
and adults come to understand and make judgments about 
kinds of controversial problems, three observations have 
been made by King and Kitchener: 

• there are striking differences in people's underlying 
assumptions about knowledge or epistemic 
assumptions, 

• these differences in assumptions are related to the way
people make and justify their own judgments about
ill-structured problems, and

• there is a developmental sequence in the patterns of 
responses and judgments about such problems. 

The RJM provides a theoretical framework for understanding 
and organizing these observations (2004, p. 5). 

King and Kitchener (2004) also observed that 
development in reasoning has stage-like properties, but
that it does not evolve in a lock-step, one-stage-at-a-time fashion
(p. 9). For example, it is common to find an individual who
relies heavily on Stage 4 assumptions while reasoning 
about a controversial problem but who also makes 
statements that are consistent with Stage 3 and Stage 5 
assumptions. By contrast, someone who relies heavily on 
Stage 2 assumptions rarely uses assumptions of any stage 
higher than Stage 3 (p. 10). 

King and Kitchener's (1994) general findings from their
ten-year longitudinal and cross-sectional studies are as follows:

• Development in reflective judgment occurs slowly and 
steadily over time and the increases in scores are not 
an artifact of selective participation or practice. 

• Stability and development are much more common 
than regression in reflective thinking. 

• People who are engaged in educational activities 
tend to improve in their reasoning about ill-structured 
problems. 

• Development typically follows the stage-related 
patterns described by the RJM. The consistently higher 
mean scores among older, more highly educated 
individuals in the cross-sectional studies, the consistent 
increase in mean scores over time in the longitudinal 
studies, and the more fine-grained analyses of the 
sequence of changes within individuals support this 
claim. 

• Being in an educational setting seems to facilitate 
development; the specific components of an 
educational environment that make a difference 
could not be determined (pp. 187-188). 

Additional researchers have used King and Kitchener's 
Reflective Judgment Model with a variety of populations. 
Janet Dale (2005) completed a study in which the 
participants were students preparing for ministry. The results of 
this study indicated that differences between entering and
graduating students' Reflective Judgment Interview (RJI)
mean scores were not statistically significant, nor did their
mean scores differ significantly between religious and secular
dilemmas. Further, students' scores did not decrease
significantly as their references to faith increased (pp. 60-63).

Friedman (2004) interviewed female students using the 
Omnibus Personality Inventory and the Reflective 
Judgment Interview and found that scores on six scales of 
the personality inventory correlated significantly with RJI 
scores; these include thinking introversion, response bias, 
altruism, autonomy, complexity, and theoretical 
orientation. These findings support the conclusion that
postformal reasoning, as described by King and Kitchener's
model, is related to measurable personality traits (pp. 301-303).

Ilacqua and Prescott (2003) used the Reflective Judgment 
Model in their introductory economics courses and found 
that older students were more comfortable with uncertainty 
and complexity and more flexible in their interpretation of 
knowledge than the younger students (pp. 368-369). 

Pirttila-Backman and Kajanne (2001) published results 
which focused on Finnish adults. The RJM average stage
score clearly increased across the two study periods: an
initial administration in the late 1980s and a follow-up in the mid-1990s.
Education, in particular education beyond a person's
primary profession/occupation, was a strong predictor of
development. Also, encountering diversity and exploratory 
orientation were related to development, but their 
connections were more complicated. No gender 
differences were found. The results support the idea that 
positive changes in thinking and reasoning take place 
during adulthood (pp. 89-91).

Pirttila-Backman (1993) completed a Finnish cross- 
sectional study in which it was shown that both educational 
level (lower vocational, higher vocational and university) 
and field (technical, nursing/medical and social sciences) 
make a difference in the RJ scores. It was further shown that 
such factors as living in a complex environment, being 
responsible for other people and having autonomy in one's 
work seem to be related to the development of RJ. The 
lower one's education level, the more important are other
life experiences (as cited in Pirttila-Backman & Kajanne,
2001, p. 82).

Reflective judgment also appears to be related to other 
dimensions of development. King and Shuford (1996) 
found a moderate positive relationship between the kinds 
of assumptions students use to reason about intellectual 
issues and the assumptions they use to reason about moral 
issues. Guthrie, King and Palmer (1999) found moderate 
positive correlations between reflective thinking and 
tolerance for diversity. Participants in this study who 
reasoned at quasi-reflective and reflective thinking levels were much
more likely to hold tolerant viewpoints with respect to race
and sexual orientation than their counterparts who held
pre-reflective assumptions (as cited in King & Kitchener,
2004, p. 22). 

The strongest contrast between college-educated and 
non-college educated adults is provided by Glenn and 
Eklund (1991). These researchers administered the RJI to
two groups of participants who were at least 65 years old 
but who differed in terms of their educational attainment. 
The first group consisted of adults with up to a high school 
education: their RJI mean score was 3.7, which is about 
half a stage higher than the overall mean score among 
high school seniors (3.3) and closer to the average for the 
college samples (3.8). The second group consisted of 
retired faculty members with doctorates; the RJI mean 
score for this group was 5.2, which is comparable to the 
scores earned by advanced graduate students (as cited in 
King and Kitchener, 1994, p. 174).

Methodology 

In 2005, the authors of this current study undertook a 
complex study of the reflective capacity of teacher 
educators at a regional college in the mid-western United 
States. Prior findings from this research have included a 
consistent, event-path model describing the processes of 
reflection incorporated by these teacher educators in 
making judgments about their own practices (Wlodarsky 
and Walters, 2006). Further analyses revealed a strong, 
cognitive and performance basis to reflection and a 
tendency to prioritize personal experience and memory 
over more objective evidence when reflecting (Wlodarsky 
and Walters, 2007). Clearly, however, reflection on practice 
seemed, in these studies, to be a typical element in
professional practice for these teacher educators. It was 
not clear, from these earlier analyses, whether it was 
possible to distinguish divergent types or levels of reflective 
practice among reflective strategies of teacher educators, 
necessitating further data collection and analyses. 

This current paper followed up the prior work by 
incorporating structured interview processes and a field- 
validated approach to scaling reflective practice 
developed by King and Kitchener (1994) that had not 
previously been used with teacher educators in a college 
setting. The Reflective Judgment Model (RJM), discussed in 
the literature review above, has been found reliable for 
linking respondent narrative regarding ill-defined problems 
to a validated stage of reflective capacity. A number of 
other professional groups, various ages and education 
levels of participants, as well as demographic criteria have 
been incorporated in research using this model. The 
interview subjects had previously provided survey 
responses and artifacts to us for analyses in the prior 
research studies, and had indicated a willingness to further 
participate in this ongoing research study. This study utilized
a mixed-method model, wherein the narrative data were
coded independently by each author following a definitional
matrix (rubric) developed for the stages of reflective
practice. Following coding of one interview transcript, the
authors discussed the use of the coding schema to isolate
deviations in definitions within the matrix and to identify a
baseline inter-rater reliability level. Following this step, they
coded the remaining interview transcripts, with the codes
interpreted as a nominal data scale. These numerical
data were then input into SPSS and used for correlational
analyses to identify patterns of response, inter-rater
reliability of the coding schema and definitional matrix,
and subsequently the overall and within-subjects
differences on the reflective judgment scale.

Reliability and Validity 

The functional reliability and validity of the questioning
instrument have been calculated and reported by King and
Kitchener (1994, pp. 268-270) across thirty-two replication
studies. The inter-rater reliabilities range from a low of .29 to
a high of .97, with twenty-four of the studies reporting
reliability coefficients in the upper quartile. The current study
falls within typical values for these studies (at the high end).
Internal reliability of the standard questions has been 
calculated across the thirty-two studies (King and Kitchener, 
1994, pp. 271-274) with a range in alpha coefficients from 
.47 to .96. Internal reliability for this current administration of 
the questions yielded an alpha coefficient of .93, again 
within but at the high end of the range of scores for the 
previous studies. It is noted that the inter-problem
correlations from the previous studies addressed only the
five standard problems from the question protocol,
whereas the authors used one standard question and one
discipline-specific question from the psychology-disciplinary
battery because of the professional knowledge
of the study group; however, they confined themselves to
the exact administration procedures delineated by King
and Kitchener so that no change to the original procedures
would threaten the reliability and validity of the questions.
As they restricted themselves to
the exact wording of the original questions, the face and
construct validity of these questions established in the
original King and Kitchener studies and the thirty-two
replication studies cited in this section of their text is
preserved. Finally, one additional study (Glenn and Eklund,
1991, cited above) utilized this structured interview 
procedure and questions with college faculty members, 
albeit retired faculty (different from our population of active 
faculty). 

Procedures 

A sample of eight teacher educators in a regional, mid- 
western college self-selected to participate in structured 
interviews with us. They were not informed about the nature 
of the interviews nor the RJM until after the interviews were 
completed. Each participant was invited to respond to an 
initial, ill-defined problem from the set provided by King 
and Kitchener (1994), following the scripted questions 
recommended for this interview protocol (1994, pp. 102-103).
Interviews were recorded digitally and transcribed
completely for content analyses. 

Analyses and Findings 

The authors coded narrative responses for the eight 
interview subjects using a matrix of epistemic categories. 
This matrix (Table 1) includes an ordinal scale from one to
seven for levels of Knowledge and Judgment related to 
personal epistemology and decision-making. They worked 
independently on the first interview narrative to code the 
interview responses for both Knowledge and Judgment 
level. They then compared the respective coding 
schemes and discussed their individual use of the 
categories to ensure a shared conceptualization of the 
embedded meaning of each level. Finally, they
worked independently of each other to code and rank the 
narrative for the remaining seven interviewees. SPSS 16.0 
was used to calculate descriptive and inferential values for 
the data set. 

A set of ninety-five cases was constructed for analysis
from narrative quotations that both of the authors
identified in the eight transcripts.
Quotations identified by only one of them were discarded. 
Each case (quotation) was coded for Knowledge and for 
Judgment by each of them, yielding four scores per case. It 
should be noted that an individual narrative selection may 
have had different scores for Knowledge and for Judgment 
(one to seven on each scale), and may have had different 
scores from one to seven from each of them. The cases 
were coded on a scale of one to seven, corresponding to 
the definitional scale (Table 1). 

A score was assigned when a word, phrase, or paragraph
seemed to stand as a single thought, i.e., was a single,
"countable" unit of thought, and when this thought was
composed of language that resonated qualitatively with
the ideas contained within the descriptions in the cells in
Table 1. Again, it is noted that the authors worked
independently through one interview transcript, compared
their coding structure for similarities and differences to solidify
and stabilize the use of the definitional matrix as a rubric,
and then proceeded to code the remaining seven
transcripts once a high level of consistency was achieved
when working through the first transcript. The final statistical
analyses were calculated both with and without the scores
for the first transcript, and it was found that the analyses with
all eight of the transcripts were the most statistically
conservative; therefore, these are the ones reported in
this study.

| Stage | View of Knowledge | Concept of Justification |
|-------|-------------------|--------------------------|
| 1 | 1K: Absolute, Concrete; External Authority. | 1J: Beliefs need no justification; No alternatives are perceived. |
| 2 | 2K: Absolute but Partial; External Authority. | 2J: Existence of alternative views is acknowledged; however, absolute knowledge is still maintained. There is a right way to believe. |
| 3 | 3K: Absolute; Uncertainty is temporary until external authority finds truth. | 3J: Beliefs are justified by reference to an authority's view. |
| 4 | 4K: Uncertain; ambiguous. | 4J: Beliefs are justified by reasons and using evidence. |
| 5 | 5K: Contextual; Subjective. | 5J: Beliefs are justified within a particular context. |
| 6 | 6K: Constructed from a variety of sources. | 6J: Beliefs are justified by comparing evidence and opinion across different contexts. |
| 7 | 7K: Constructed through a process of inquiry. | 7J: Beliefs are justified probabilistically based on a variety of interpretive considerations. |

Table 1. King and Kitchener's Seven Stages of Reflective Judgment

Each of the ninety-five cases included four scores
(Researcher 1, Knowledge and Judgment, and 
Researcher 2, Knowledge and Judgment). The inter-rater 
reliability of the independently coded scores was 
calculated using Cronbach's alpha at .93 overall (Table 2). 
This is a very high level of consistency among the cases and 
scores, suggesting that the definitions and language on the 
reflective scale were robust enough to accommodate the type of
language typically used by teacher educators to discuss 
the ill-defined problems. This supports a conclusion that this 
measurement scale is valid and appropriate to use in 
working with teacher educators and to describe reflection 
specific to that professional field. 
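
As an illustration of how a coefficient of this kind is obtained, the sketch below computes Cronbach's alpha for a matrix of cases by ratings using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of case totals). It is a minimal sketch only and is not the SPSS procedure used in this study; the function name cronbach_alpha and the five sample rows are hypothetical and are not data from the ninety-five cases.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of cases, each a list of k item scores.

    Here an "item" is one rating column, e.g. Researcher 1 Knowledge.
    """
    k = len(rows[0])
    # Sample variance of each rating column across cases.
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the per-case total score.
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    # Hypothetical stage codes (1 to 7) for five quotations, four ratings each.
    sample_cases = [
        [4, 4, 4, 5],
        [3, 3, 4, 3],
        [5, 5, 5, 5],
        [2, 3, 2, 3],
        [6, 5, 6, 6],
    ]
    print(f"alpha = {cronbach_alpha(sample_cases):.3f}")
```

When the four ratings of every case are identical, the formula returns exactly 1; alpha falls as the ratings within cases diverge relative to the spread between cases.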

Within the cases, the individual item descriptive statistics
(Table 3) revealed a fairly small variability around a similar
mean score of approximately 4.2 to 4.3. Overall, these
scores place the group of eight participants slightly
above the midpoint (4) of the seven-point reflective scale, that is,
slightly toward the constructivist orientation rather than the
objectivist orientation. The researchers expected a more highly
constructivist score, i.e., closer to a mean score of 5.0 or 6.0,
given cultural perceptions of teacher educators, as well as
a prior study (Glenn and Eklund, 1991) that used this scale
with retired college faculty members (doctoral level) and
found an overall mean score of 5.20.

| Cronbach's Alpha | Cronbach's Alpha Based on Standardized Items | N of Items |
|------------------|----------------------------------------------|------------|
| .927 | .930 | 4 |

Table 2. Inter-Rater Reliability

| Item | Mean | Std. Deviation | N |
|------|------|----------------|---|
| 1K | 4.22 | 1.354 | 95 |
| 1J | 4.26 | 1.354 | 95 |
| 2K | 4.32 | 1.132 | 95 |
| 2J | 4.34 | 1.107 | 95 |

Table 3. Item Statistics (Item = rating)

Post hoc testing was performed to ascertain the 
relationship among the four scores used in this analysis. 
There were no statistically significant differences between 
the items based on the ANOVA (p=.549), which was 
expected based on the very high Cronbach's alpha for the
data set (Table 4). This finding confirms the highly similar 
rank scores for both Knowledge and Judgment. 
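
The entries in Table 4 can be checked arithmetically. The sketch below recomputes the mean squares, the F ratio for the between-items effect, and Cronbach's alpha through the ANOVA identity alpha = 1 - MS(residual) / MS(between people), often called Hoyt's method. The sums of squares and degrees of freedom are copied from Table 4; the function names are illustrative only, and the significance level is not recomputed because that would require an F-distribution routine outside the standard library.

```python
# Sums of squares and degrees of freedom copied from Table 4.
SS_BETWEEN_PEOPLE, DF_BETWEEN_PEOPLE = 476.805, 94
SS_BETWEEN_ITEMS, DF_BETWEEN_ITEMS = 0.779, 3
SS_RESIDUAL, DF_RESIDUAL = 103.721, 282


def mean_square(sum_of_squares, df):
    """Mean square = sum of squares divided by its degrees of freedom."""
    return sum_of_squares / df


def f_between_items():
    """F ratio testing whether the four ratings differ in mean level."""
    return (mean_square(SS_BETWEEN_ITEMS, DF_BETWEEN_ITEMS)
            / mean_square(SS_RESIDUAL, DF_RESIDUAL))


def alpha_from_anova():
    """Cronbach's alpha via 1 - MS(residual) / MS(between people)."""
    return 1 - (mean_square(SS_RESIDUAL, DF_RESIDUAL)
                / mean_square(SS_BETWEEN_PEOPLE, DF_BETWEEN_PEOPLE))


if __name__ == "__main__":
    print(f"F (between items) = {f_between_items():.3f}")
    print(f"alpha             = {alpha_from_anova():.3f}")
```

Rounded to three decimals, these values agree with the F of .706 reported in Table 4 and the alpha of .927 reported in Table 2, which shows that the two tables summarize the same variance decomposition.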

Conclusions and Implications 

Moving beyond the narrow focus of the research
problem outlined above, the authors have made
several observations, each of which carries very
practical implications for this body of work.

First, facilitating and enhancing the capability of teacher 
educators to be reflective and to inculcate reflectivity 
among licensure candidates is critical to the success of the 
teaching profession. Consequently, identifying a reliable 
and valid conceptual model to operationalize and 
measure reflection among these groups is an important 
step to identifying practice solutions that are effective and 
sustainable. The Reflective Judgment Model incorporated 
in this study has been found to be appropriate and reliable, 
and to accommodate the cultural vocabulary of teacher 
educators. The matrix in Table 1, when used as a rubric to
scale teacher educator reflective capacity, was functional
with a very high measured reliability. Were this type of scale
used consistently with larger groups of teacher educators
over time and in various demographic and socio-cultural
environments, important variables related to the formation
of reflective capacity among professional educators
might be observed.

| Source | Sum of Squares | df | Mean Square | F | Sig |
|--------|----------------|----|-------------|---|-----|
| Between People | 476.805 | 94 | 5.072 | | |
| Within People: Between Items | .779 | 3 | .260 | .706 | .549 |
| Within People: Residual | 103.721 | 282 | .368 | | |
| Within People: Total | 104.500 | 285 | .367 | | |
| Total | 581.305 | 379 | 1.534 | | |

Table 4. ANOVA results for differences between items, revealing
no significant differences.

Further, given that the Reflective Judgment Model proved 
reliable for scaling teacher educators' reflective capability, 
it would be appropriate to directly compare reflective 
scores for teacher educators to other professions which 
have been studied with this same RJM. In many areas of 
educational research, traditional research lines have failed 
to yield fruitful and energizing results which hold promise for 
powerful impact on the field of practice. Findings from
research with other professional groups which used the RJM
may contribute to a deeper understanding of reflection
among teacher educators, thereby facilitating
growth in reflection and, subsequently,
enhancing reflective ability among their students, i.e.,
licensure candidates. These findings may also open new
research lines toward an understanding of the relationship 
of self-awareness to professional competence for teacher 
educators, and how these translate to licensure 
candidates under the direction of these teacher 
educators. 

Second, the authors have clearly observed and cited the
use of a common instrument and conceptual construct
that functions reliably across a broad group of
populations whose commonalities are adulthood,
continuous learning not necessarily confined to formal or
institutional settings, and learning in professional contexts.
This "larger tent" approach to literature has been a hallmark
of the adult education movement in the United States since
its inception and, as the authors are finding
in this paper, it enriches their research and learning. They
perceive that the failure to incorporate the rich traditions
and literatures across the fields engaged with adult
learning has become an obstacle to professional renewal
and growth in their field, that of teacher education. Within
their own college setting, the insularity that is produced
through over-limitation of literary categories, through
over-reliance on literature specific to teacher educators, and
through an unnecessary delimiting of learning from
multiple fields of inquiry, is at the very least intellectually
stifling.

Pragmatically, there is much the authors can learn about
themselves as teacher educators if they learn to first view
themselves as adult learners generally, with much in
common with individuals and colleagues from many other
traditions and contexts. The authors may find solutions to
what they have construed as practice problems unique to
teacher education from those other traditions.

Third, in this study, every participant revealed narrative from 
every level of the RJM. However, a clear preponderance of 
scores revealed an average or typical reflective level of 
slightly higher than 4.0. This observation supports King and
Kitchener's findings, which observed that individuals would 
have a typical level, while occasionally responding above 
or below that level. However, the authors were surprised to 
observe that typical, cultural characterizations of teacher 
educators, i.e. highly postmodernist and constructivist in 
orientation, did not hold up in this analysis. Teacher 
educators were more typically found to be at the center of 
the epistemic scale. They were comfortable with 
authoritative knowledge, external authority and evidence, 
and objectivity and rationalism as the means to 
understanding. This finding would situate the field of
teacher education more centrally, philosophically, than 
modern social preconceptions held by the general public. 

Given the relatively mid-range of scores of the faculty 
members who participated in this study and the authors'
perception that they are not atypical of college faculty in 
other institutions, there is room for professional 
development work to enhance the evolution of college 
faculty with respect to personal reflective capacity. There 
was a gap observed in the response scores of their faculty 
members and those obtained by Glenn and Eklund (1991) 
in their study of very late career faculty members. To the
degree that their participants are similar to the faculty 
studied by Glenn and Eklund, it is important to identify the 
types of professional development that mid- to late-career 
professors might engage in that would result in the kind of 
growth in reflective judgment required to move from the 
approximately 4.0 stage to the high 5.0 range. For their 
faculty, it may be possible to develop a trajectory of growth 
in reflective capacity on the King and Kitchener scale 
based on their current levels, the professional growth 
activities in which they engage, and their similarity or 
difference to the Glenn and Eklund study sample. This
research and the literary context they have established
suggest that structured, formal learning not necessarily
related to the profession of college professor or teacher
educator (perhaps more classical, liberal arts, or content-oriented in
nature) would contribute to increasing the reflective
capacity of their faculty and other faculty who may be like
these individuals. More broadly construed, and noting that
the following thought is perhaps fodder for an entirely
different and lengthy conversation, the ongoing concerns
over the preparation or fit of teacher educators within the
academy may also be ameliorated somewhat by the use
of increased formal learning experiences to broaden and
deepen the content knowledge of these individuals,
thereby contributing at the same time to the creation of a more
reflective faculty.